Reinforcement learning of a continuous motor sequence with hidden states

Authors

  • Hiroaki Arie
  • Tetsuya Ogata
  • Jun Tani
  • Shigeki Sugano
Abstract

Reinforcement learning is a scheme for unsupervised learning in which robots are expected to acquire behavioral skills through self-exploration based on reward signals. Conventional reinforcement learning algorithms are difficult to apply to robot motion control tasks, however, because most of them assume a discrete state space and complete observability of the state. Real-world environments are often only partially observable, so robots have to estimate the unobservable hidden states. This paper proposes a method that addresses both problems by combining a reinforcement learning algorithm with a learning algorithm for a continuous time recurrent neural network (CTRNN). The CTRNN can learn spatiotemporal structures in a continuous time and space domain, and can preserve the contextual flow by self-organizing an appropriate internal memory structure. This enables the robot to deal with the hidden state problem. We carried out an experiment on the pendulum swing-up task without rotational speed information. Using the proposed algorithm, the task was accomplished within several hundred trials. In addition, the rotational speed of the pendulum, which is treated as a hidden state, is shown to be estimated and encoded in the activation of a context neuron.
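To make the setting in the abstract concrete, the following is a minimal sketch of the two ingredients it names: a pendulum observed without rotational speed, and a leaky-integrator CTRNN whose internal state can carry information across time steps. The equations, parameter values, weights, and function names here are illustrative assumptions and are not taken from the paper; in particular, the weights are fixed by hand and no reinforcement learning update is shown.

```python
import math

def ctrnn_step(u, x, W_in, W_rec, bias, tau, dt=0.05):
    """One forward-Euler step of a generic leaky-integrator CTRNN (assumed form):
    tau * du_i/dt = -u_i + sum_j W_rec[i][j]*tanh(u_j) + sum_k W_in[i][k]*x_k + bias_i
    """
    y = [math.tanh(v) for v in u]  # neuron activations
    new_u = []
    for i in range(len(u)):
        drive = bias[i]
        drive += sum(W_rec[i][j] * y[j] for j in range(len(u)))
        drive += sum(W_in[i][k] * x[k] for k in range(len(x)))
        new_u.append(u[i] + (dt / tau) * (-u[i] + drive))
    return new_u

def pendulum_step(theta, omega, torque, dt=0.05, g=9.8, m=1.0, l=1.0, mu=0.01):
    """Euler step of a damped pendulum; theta = 0 is upright (assumed convention)."""
    alpha = (-mu * omega + m * g * l * math.sin(theta) + torque) / (m * l * l)
    omega = omega + alpha * dt
    theta = theta + omega * dt
    return theta, omega

def observe(theta):
    """Partial observation: the angle only, with no rotational speed."""
    return [math.cos(theta), math.sin(theta)]

def rollout(steps=100):
    """Run the untrained network on the pendulum and record the context neuron."""
    # Hypothetical hand-picked weights: neuron 0 drives the motor, neuron 1 is context.
    W_in = [[0.5, -0.5], [1.0, 1.0]]
    W_rec = [[0.0, 1.0], [-1.0, 0.5]]
    bias = [0.0, 0.0]
    u = [0.0, 0.0]
    theta, omega = math.pi, 0.0  # start hanging down, at rest
    context_trace = []
    for _ in range(steps):
        u = ctrnn_step(u, observe(theta), W_in, W_rec, bias, tau=0.5)
        torque = 2.0 * math.tanh(u[0])  # bounded motor command
        theta, omega = pendulum_step(theta, omega, torque)
        context_trace.append(math.tanh(u[1]))
    return context_trace
```

Because `observe` never exposes `omega`, any dependence of the torque on rotational speed has to come from the recurrent state `u`, which is the role the abstract attributes to the context neuron after learning.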


Similar resources

First Results with Instance-Based State Identification for Reinforcement

When a reinforcement learning agent's next course of action depends on information that is hidden from the sensors because of problems such as occlusion, restricted range, bounded field of view and limited attention, we say the agent suffers from the hidden state problem. State identification techniques use history information to uncover hidden state. Previous approaches to encoding history include...

Sequence Representation in Animals and Networks: Study of a Recurrent Network Trained with Reinforcement Learning

Neural encoding for sequence identification, memory and production is studied using an Elman-style recurrent network (Elman, 1990). A novel feature of this network is that learning is implemented using a biologically plausible reinforcement learning paradigm. Findings from Tanji & Shima's (1994) experiments on monkeys indicate that there is sequence-specific activity in the supplementary m...

Instance-based State Identification for Reinforcement Learning

This paper presents instance-based state identification, an approach to reinforcement learning and hidden state that builds disambiguating amounts of short-term memory on-line, and also learns with an order of magnitude fewer training steps than several previous approaches. Inspired by a key similarity between learning with hidden state and learning in continuous geometrical spaces, this approac...

Learning movement sequences with a delayed reward signal in a hierarchical model of motor function

A key problem in reinforcement learning is how an animal is able to learn a sequence of movements when the reward signal only occurs at the end of the sequence. We describe how a hierarchical dynamical model of motor function is able to solve the problem of delayed reward in learning movement sequences using associative (Hebbian) learning. At the lowest level, the motor system encodes simple mo...

Hidden state and reinforcement learning with instance-based state identification

Real robots with real sensors are not omniscient. When a robot's next course of action depends on information that is hidden from the sensors because of problems such as occlusion, restricted range, bounded field of view and limited attention, we say the robot suffers from the hidden state problem. State identification techniques use history information to uncover hidden state. Some previous ap...


Journal title:
  • Advanced Robotics

Volume 21  Issue 

Pages  -

Publication date 2007